Abstract. The problem of extracting consistent information from relational databases
violating integrity constraints on numerical data is addressed. In particular, aggregate
constraints defined as linear inequalities on aggregate-sum queries on input data are
considered. The notion of repair as a consistent set of updates at attribute-value level is
exploited, and the characterization of several complexity issues related to repairing
data and computing consistent query answers is provided.



1 Introduction 

Research has deeply investigated several issues related to the use of integrity constraints 
on relational databases. In this context, a great deal of attention has been devoted to the 
problem of extracting reliable information from databases containing pieces of infor- 
mation inconsistent w.r.t. some integrity constraints. All previous works in this area 
deal with "classical" forms of constraint (such as keys, foreign keys, functional depen- 
dencies), and propose different strategies for updating inconsistent data reasonably, in 
order to make it consistent by means of minimal changes. However, these kinds of
constraint often do not suffice to manage data consistency, as they cannot be used to define
algebraic relations between stored values. This issue frequently occurs in several
scenarios, such as scientific databases, statistical databases, and data warehouses, where 
numerical values of tuples are derivable by aggregating values stored in other tuples. 

In this work we focus our attention on databases where stored data violates a set 
of aggregate constraints, i.e. integrity constraints defined on aggregate values extracted 
from the database. These constraints are defined on numerical attributes (such as sales 
prices, costs, etc.) which represent measure values and are not intrinsically involved in 
other forms of constraints. 

Example 1. Table 1 represents a two-year cash budget for a firm, that is, a summary
of cash flows (receipts, disbursements, and cash balances) over the specified periods.
Values 'det', 'aggr' and 'drv' in column Type stand for detail, aggregate and derived,
respectively. In particular, an item of the table is aggregate if it is obtained by aggregating
items of type detail of the same section, whereas a derived item is an item whose
value can be computed using the values of other items of any type and belonging to any
section.

A cash budget must satisfy these integrity constraints: 

Year  Section        Subsection           Type  Value
----  -------------  -------------------  ----  -----
2003  Receipts       beginning cash       drv      20
2003  Receipts       cash sales           det     100
2003  Receipts       receivables          det     120
2003  Receipts       total cash receipts  aggr    250
2003  Disbursements  payment of accounts  det     120
2003  Disbursements  capital expenditure  det       0
2003  Disbursements  long-term financing  det      40
2003  Disbursements  total disbursements  aggr    160
2003  Balance        net cash inflow      drv      60
2003  Balance        ending cash balance  drv      80
2004  Receipts       beginning cash       drv      80
2004  Receipts       cash sales           det     100
2004  Receipts       receivables          det     100
2004  Receipts       total cash receipts  aggr    200
2004  Disbursements  payment of accounts  det     130
2004  Disbursements  capital expenditure  det      40
2004  Disbursements  long-term financing  det      20
2004  Disbursements  total disbursements  aggr    190
2004  Balance        net cash inflow      drv      10
2004  Balance        ending cash balance  drv      90

Table 1. A cash budget 



1. for each section and year, the sum of the values of all detail items must be equal to
the value of the aggregate item of the same section and year;

2. for each year, the net cash inflow must be equal to the difference between total cash
receipts and total disbursements;

3. for each year, the ending cash balance must be equal to the sum of the beginning cash
and the net cash inflow.

Table 1 was acquired by means of an OCR tool from two paper documents, reporting
the cash budget for 2003 and 2004. The original paper documents were consistent, but
some symbol recognition errors occurred during the digitizing phase, as constraints 1)
and 2) are not satisfied on the acquired data for year 2003, that is:

i) in section Receipts, the aggregate value of total cash receipts is not equal to the sum
of detail values of the same section;

ii) the value of net cash inflow is not equal to the difference between total cash receipts
and total disbursements.

In order to exploit the digital version of the cash budget, a fundamental issue is to 
define a reasonable strategy for locating OCR errors, and then "repairing" the acquired 
data to extract reliable information. 
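
The check described in Example 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the row layout, the name `violations`, and the way the three constraints are encoded are choices made here, not part of the original framework. The 2003 beginning cash (20) and the 2003 capital expenditure (0) are values inferred from constraints 3 and 1 rather than read directly from the acquired data.

```python
from collections import defaultdict

# Rows of Table 1: (Year, Section, Subsection, Type, Value).
# Assumption: 2003 beginning cash = 20 and 2003 capital expenditure = 0 are inferred values.
CASH_BUDGET = [
    (2003, "Receipts", "beginning cash", "drv", 20),
    (2003, "Receipts", "cash sales", "det", 100),
    (2003, "Receipts", "receivables", "det", 120),
    (2003, "Receipts", "total cash receipts", "aggr", 250),
    (2003, "Disbursements", "payment of accounts", "det", 120),
    (2003, "Disbursements", "capital expenditure", "det", 0),
    (2003, "Disbursements", "long-term financing", "det", 40),
    (2003, "Disbursements", "total disbursements", "aggr", 160),
    (2003, "Balance", "net cash inflow", "drv", 60),
    (2003, "Balance", "ending cash balance", "drv", 80),
    (2004, "Receipts", "beginning cash", "drv", 80),
    (2004, "Receipts", "cash sales", "det", 100),
    (2004, "Receipts", "receivables", "det", 100),
    (2004, "Receipts", "total cash receipts", "aggr", 200),
    (2004, "Disbursements", "payment of accounts", "det", 130),
    (2004, "Disbursements", "capital expenditure", "det", 40),
    (2004, "Disbursements", "long-term financing", "det", 20),
    (2004, "Disbursements", "total disbursements", "aggr", 190),
    (2004, "Balance", "net cash inflow", "drv", 10),
    (2004, "Balance", "ending cash balance", "drv", 90),
]


def violations(rows):
    """Return the violated constraints as (constraint number, year, section or None)."""
    value = {(year, sub): v for year, _sec, sub, _typ, v in rows}
    detail_sum = defaultdict(int)
    aggregate = {}
    for year, sec, _sub, typ, v in rows:
        if typ == "det":
            detail_sum[(year, sec)] += v
        elif typ == "aggr":
            aggregate[(year, sec)] = v

    failed = []
    # Constraint 1: detail items must sum to the aggregate item of the same section and year.
    for (year, sec) in sorted(aggregate):
        if detail_sum[(year, sec)] != aggregate[(year, sec)]:
            failed.append((1, year, sec))
    for year in sorted({r[0] for r in rows}):
        receipts = value[(year, "total cash receipts")]
        disbursements = value[(year, "total disbursements")]
        inflow = value[(year, "net cash inflow")]
        # Constraint 2: net cash inflow = total cash receipts - total disbursements.
        if inflow != receipts - disbursements:
            failed.append((2, year, None))
        # Constraint 3: ending cash balance = beginning cash + net cash inflow.
        if value[(year, "ending cash balance")] != value[(year, "beginning cash")] + inflow:
            failed.append((3, year, None))
    return failed


print(violations(CASH_BUDGET))
```

On the data above, the check reports exactly the two violations i) and ii) for year 2003 and none for 2004.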

Most well-known techniques for repairing data violating either key constraints
or functional dependencies accomplish this task by performing deletions and insertions
of tuples. This approach is not suitable for contexts analogous to that of Example 1,
that is, of data acquired by OCR tools from paper documents. For instance,
repairing Table 1 by either adding or removing rows means hypothesizing that the OCR
tool either skipped a row or "invented" it when acquiring the source paper document,
which is rather unrealistic. The same issue arises in other scenarios dealing with
numerical data representing pieces of information acquired automatically, such as sensor
networks. In a sensor network with error-free communication channels, no reading
generated by sensors can be lost, thus repairing the database by adding new readings (as
well as removing collected ones) makes no sense. In this kind of scenario, the most natural
approach to data repairing is updating the numerical data directly: this means working
at attribute level, rather than at tuple level. For instance, in the case of Example 1, we
can reasonably assume that inconsistencies of digitized data are due to symbol recognition
errors, and thus trying to reconstruct actual data values is well founded. Likewise,
in the case of sensor readings violating aggregate constraints, we can hypothesize that
inconsistency is due to some trouble that occurred at a sensor while generating some
reading, thus repairing data by modifying readings instead of deleting (or inserting) them
is justified.

1.1 Related Work 

The first theoretical approaches to the problem of dealing with incomplete and inconsistent
information date back to the 1980s, but these works mainly focus on issues related to the
semantics of incompleteness [12]. The problem of extracting reliable information from
inconsistent data was first addressed in [4], where an extension of relational algebra 
(namely flexible algebra) was proposed to evaluate queries on data inconsistent w.r.t. 
key constraints (i.e. tuples having the same values for key attributes, but conflicting 
values for other attributes). The first proof-theoretic notion of consistent query answer 
was introduced in [6], expressing the idea that tuples involved in an integrity viola- 
tion should not be considered in the evaluation of consistent query answering. In [1] 
a different notion of consistent answer was introduced, based on the notion of repair: 
a repair of an inconsistent database D is a database D' satisfying the given integrity 
constraints and which is minimally different from D. Thus, the consistent answer of a
query q posed on D is the answer which is in the result of q on every repair D'. In
particular, in [1] the authors show that, for restricted classes of queries and constraints, 
consistent answers can be evaluated without computing repairs, but by looking only at 
the specified constraints and rewriting the original query q into a query q' such that the 
answer of q' on D is equal to the consistent answer of q on D. Based on the notions of 
repair and consistent query answer introduced in [1], several works investigated more 
expressive classes of queries and constraints. In [2] extended disjunctive logic programs 
with exceptions were used for the computation of repairs, and in [3] the evaluation of 
aggregate queries on inconsistent data was investigated. A further generalization was 
proposed in [11], where the authors defined a technique based on the rewriting of con- 
straints into extended disjunctive rules with two different forms of negation (negation 
as failure and classical negation). This technique was shown to be sound and complete 
for universally quantified constraints. 






S. Flesca, F. Furfaro, F. Parisi 



All the above-cited approaches assume that tuple insertions and deletions are the 
basic primitives for repairing inconsistent data. More recently, in [9] a repairing strategy 
using only tuple deletions was proposed, and in [17] repairs also consisting of update 
operations were considered. The latter is the first approach performing repairs at the 
attribute-value level, but is not well-suited in our context, as it works only in the case 
that constraints consist of full dependencies. 

The first work investigating aggregate constraints on numerical data is [16], where 
the consistency problem of very general forms of aggregation is considered, but no is- 
sue related to data-repairing is investigated. In [5] the problem of repairing databases 
by fixing numerical data at attribute level is investigated. The authors show that decid- 
ing the existence of a repair under both denial constraints (where built-in comparison 
predicates are allowed) and a non-linear form of multi-attribute aggregate constraints
is undecidable. Then they disregard aggregate constraints and focus on the problem of 
repairing data violating denial constraints, where no form of aggregation is allowed in 
the adopted constraints. 

1.2 Main Contribution 

We investigate the problem of repairing and extracting reliable information from data 
violating a given set of aggregate constraints. These constraints consist of linear
inequalities on aggregate-sum queries issued on measure values stored in the database.
This syntactic form enables meaningful constraints to be expressed, such as those of 
Example 1 as well as other forms which often occur in practice. 

We consider database repairs consisting of "reasonable" sets of value-update op- 
erations aiming at re-constructing the correct measure values of inconsistent data. We 
adopt two different criteria for determining whether a set of update operations repairing 
data can be considered "reasonable" or not: set-minimal semantics and card-minimal
semantics. Both these semantics aim at preserving the information represented in the 
source data as much as possible. They correspond to different repairing strategies which 
turn out to be well-suited for different application scenarios. 

We provide the complexity characterization of three fundamental problems: i) re- 
pairability (is there at least one repair for the given database w.r.t. the specified con- 
straints?); ii) repair checking (given a set of update operations, is it a "reasonable" re- 
pair?); iii) consistent query answer (is a given boolean query true in every "reasonable" 
repair?). 

2 Preliminaries 

We assume classical notions of database scheme, relational scheme, and relations. In
the following we will also use a logical formalism to represent relational databases,
and relational schemes will be represented by means of sorted predicates of the form
$R(A_1 : \Delta_1, \dots, A_n : \Delta_n)$, where $A_1, \dots, A_n$ are attribute names and
$\Delta_1, \dots, \Delta_n$ are the corresponding domains. Each $\Delta_i$ can be either $\mathbb{Z}$
(infinite domain of integers), $\mathbb{R}$ (reals), or $\mathbb{S}$ (strings). Domains $\mathbb{R}$ and $\mathbb{Z}$
will be said to be numerical domains, and attributes defined over $\mathbb{R}$ or $\mathbb{Z}$ will be
said to be numerical attributes. Given a ground atom $t$ denoting a tuple, the value of
attribute $A$ of $t$ will be denoted as $t[A]$.

Given a database scheme $\mathcal{D}$, we will denote as $\mathcal{M}_{\mathcal{D}}$ (namely, Measure attributes)
the set of numerical attributes representing measure data. That is, $\mathcal{M}_{\mathcal{D}}$ specifies the
set of attributes representing measure values, such as weights, lengths, prices, etc. For
instance, in Example 1, $\mathcal{M}_{\mathcal{D}}$ consists of the only attribute Value.

Given two sets $M$, $M'$, $M \,\triangle\, M'$ denotes their symmetric difference
$(M \cup M') \setminus (M \cap M')$.



2.1 Aggregate constraints 

Given a relational scheme $R(A_1 : \Delta_1, \dots, A_n : \Delta_n)$, an attribute expression on $R$ is
defined recursively as follows:

- a numerical constant is an attribute expression;

- each $A_i$ (with $i \in [1..n]$) is an attribute expression;

- $e_1 \,\psi\, e_2$ is an attribute expression on $R$, if $e_1$, $e_2$ are attribute expressions on $R$ and
$\psi$ is an arithmetic operator in $\{+, -\}$;

- $c \times (e)$ is an attribute expression on $R$, if $e$ is an attribute expression on $R$ and $c$ is a
numerical constant.

Let $R$ be a relational scheme, $e$ an attribute expression on $R$, and $C$ a boolean
formula on constants and attributes of $R$. An aggregation function on $R$ is a function
$\chi : (\Delta_1 \times \cdots \times \Delta_k) \to \mathbb{R}$, where $\Delta_1, \dots, \Delta_k$ are the relational domains of some
attributes $A_1, \dots, A_k$ of $R$. $\chi(x_1, \dots, x_k)$ is defined as follows:

$\chi(x_1, \dots, x_k)$ = SELECT sum($e$)
                          FROM $R$
                          WHERE $\alpha(x_1, \dots, x_k)$

where $\alpha(x_1, \dots, x_k) = C \wedge (A_1 = x_1) \wedge \cdots \wedge (A_k = x_k)$.

Example 2. The following aggregation functions are defined on the relational scheme 
CashBudget(Year, Section, Subsection, Type, Value) of Example 1: 

$\chi_1(x, y, z)$ = SELECT sum(Value)
                   FROM CashBudget
                   WHERE Section = x AND Year = y AND Type = z

$\chi_2(x, y)$ = SELECT sum(Value)
                FROM CashBudget
                WHERE Year = x AND Subsection = y

Function $\chi_1$ returns the sum of Value of all the tuples having Section x, Year y and
Type z. For instance, $\chi_1$('Receipts', '2003', 'det') returns 100 + 120 = 220, whereas
$\chi_1$('Disbursements', '2003', 'aggr') returns 160. Function $\chi_2$ returns the sum of Value of
all the tuples where Year=x and Subsection=y. In our running example, as the pair
Year, Subsection uniquely identifies tuples of CashBudget, the sum returned by $\chi_2$
coincides with a single value. For instance, $\chi_2$('2003', 'cash sales') returns 100, whereas
$\chi_2$('2004', 'net cash inflow') returns 10.
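
The two aggregation functions of Example 2 can be mimicked in Python as a rough sketch. The helper `aggregate`, the tuple layout, and the choice of treating an empty sum as 0 (SQL would return NULL) are assumptions made for illustration; only the tuples needed by the calls of Example 2 are listed.

```python
# A subset of the CashBudget tuples of Table 1: (Year, Section, Subsection, Type, Value).
ROWS = [
    (2003, "Receipts", "cash sales", "det", 100),
    (2003, "Receipts", "receivables", "det", 120),
    (2003, "Receipts", "total cash receipts", "aggr", 250),
    (2003, "Disbursements", "total disbursements", "aggr", 160),
    (2004, "Balance", "net cash inflow", "drv", 10),
]


def aggregate(rows, expr, condition):
    """SELECT sum(expr) FROM rows WHERE condition; an empty sum is taken as 0 here."""
    return sum(expr(r) for r in rows if condition(r))


def chi1(section, year, typ, rows=ROWS):
    """Sum of Value over tuples with the given Section, Year and Type."""
    return aggregate(rows, lambda r: r[4],
                     lambda r: r[1] == section and r[0] == year and r[3] == typ)


def chi2(year, subsection, rows=ROWS):
    """Sum of Value over tuples with the given Year and Subsection."""
    return aggregate(rows, lambda r: r[4],
                     lambda r: r[0] == year and r[2] == subsection)


print(chi1("Receipts", 2003, "det"), chi1("Disbursements", 2003, "aggr"))
print(chi2(2003, "cash sales"), chi2(2004, "net cash inflow"))
```

The condition passed to `aggregate` plays the role of $\alpha(x_1, \dots, x_k)$: a fixed boolean formula conjoined with equalities between attributes and the function arguments.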

Definition 1 (Aggregate constraint). Given a database scheme $\mathcal{D}$, an aggregate
constraint on $\mathcal{D}$ is an expression of the form:

$$\forall x_1, \dots, x_k \left( \phi(x_1, \dots, x_k) \Longrightarrow \sum_{i=1}^{n} c_i \cdot \chi_i(X_i) \le K \right) \qquad (1)$$

where:

1. $c_1, \dots, c_n, K$ are constants;

2. $\phi(x_1, \dots, x_k)$ is a conjunction of atoms containing the variables $x_1, \dots, x_k$;

3. each $\chi_i(X_i)$ is an aggregation function, where $X_i$ is a list of variables and
constants, and variables appearing in $X_i$ are a subset of $\{x_1, \dots, x_k\}$.

Given a database $D$ and a set of aggregate constraints $\mathcal{AC}$, we will use the notation
$D \models \mathcal{AC}$ [resp. $D \not\models \mathcal{AC}$] to say that $D$ is consistent [resp. inconsistent] w.r.t. $\mathcal{AC}$.
Observe that aggregate constraints enable equalities to be expressed as well, since an
equality can be viewed as a pair of inequalities. For the sake of brevity, in the following
equalities will be written explicitly.

Example 3. Constraint 1 defined in Example 1 can be expressed as follows:

$\forall x, y, s, t, v \;\; CashBudget(y, x, s, t, v) \Longrightarrow \chi_1(x, y, \text{'det'}) - \chi_1(x, y, \text{'aggr'}) = 0$

For the sake of simplicity, in the following we will use a shorter notation for denoting
aggregate constraints, where universal quantification is implied and variables in $\phi$ which
do not occur in any aggregation function are replaced with the symbol '_'. For instance,
the constraint of Example 3 can be written as follows:

$CashBudget(y, x, \_, \_, \_) \Longrightarrow \chi_1(x, y, \text{'det'}) - \chi_1(x, y, \text{'aggr'}) = 0$
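
To illustrate how an equality fits the form (1), the following Python sketch evaluates Constraint 1 on the Receipts tuples of Table 1 as a pair of inequalities. The function name `failing_substitutions` and the tuple layout are hypothetical; ground substitutions are taken from the tuples matching $CashBudget(y, x, \_, \_, \_)$.

```python
# Receipts tuples of Table 1: (Year, Section, Subsection, Type, Value).
RECEIPTS = [
    (2003, "Receipts", "cash sales", "det", 100),
    (2003, "Receipts", "receivables", "det", 120),
    (2003, "Receipts", "total cash receipts", "aggr", 250),
    (2004, "Receipts", "cash sales", "det", 100),
    (2004, "Receipts", "receivables", "det", 100),
    (2004, "Receipts", "total cash receipts", "aggr", 200),
]


def chi1(rows, section, year, typ):
    """Sum of Value over tuples with the given Section, Year and Type."""
    return sum(v for (y, sec, _sub, t, v) in rows
               if sec == section and y == year and t == typ)


def failing_substitutions(rows, c_det, c_aggr, bound):
    """Ground substitutions (x, y) for which
    c_det * chi1(x, y, 'det') + c_aggr * chi1(x, y, 'aggr') <= bound does not hold."""
    substitutions = sorted({(sec, y) for (y, sec, _sub, _t, _v) in rows})
    return [(x, y) for (x, y) in substitutions
            if c_det * chi1(rows, x, y, "det") + c_aggr * chi1(rows, x, y, "aggr") > bound]


# The equality "chi1(det) - chi1(aggr) = 0" is the conjunction of two inequalities of form (1).
upper = failing_substitutions(RECEIPTS, 1, -1, 0)   # chi1(det) - chi1(aggr) <= 0
lower = failing_substitutions(RECEIPTS, -1, 1, 0)   # chi1(aggr) - chi1(det) <= 0
print(upper, lower)
```

For 2003 the detail values sum to 220 while the aggregate is 250, so only the second inequality fails, and only for the substitution (Receipts, 2003).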

Example 4. Constraints 2 and 3 defined in Example 1 can be expressed as follows:

Constraint 2: $CashBudget(x, \_, \_, \_, \_) \Longrightarrow$
$\chi_2(x, \text{'net cash inflow'}) - (\chi_2(x, \text{'total cash receipts'}) - \chi_2(x, \text{'total disbursements'})) = 0$

Constraint 3: $CashBudget(x, \_, \_, \_, \_) \Longrightarrow$
$\chi_2(x, \text{'ending cash balance'}) - (\chi_2(x, \text{'beginning cash'}) + \chi_2(x, \text{'net cash inflow'})) = 0$

Consider the database scheme consisting of relation CashBudget and relation
Sales(Product, Year, Income), containing pieces of information on annual product sales. The
following aggregate constraint says that, for each year, the value of cash sales in
CashBudget must be equal to the total incomes obtained from relation Sales:

$CashBudget(x, \_, \_, \_, \_) \wedge Sales(\_, x, \_) \Longrightarrow \chi_2(x, \text{'cash sales'}) - \chi_3(x) = 0$

where $\chi_3(x)$ is the aggregation function returning the total income due to product sales
in year x:

$\chi_3(x)$ = SELECT sum(Income)
             FROM Sales
             WHERE Year = x

2.2 Updates 

Updates at attribute-level will be used in the following as the basic primitives for
repairing data violating aggregate constraints. Given a relational scheme $R$ in the database
scheme $\mathcal{D}$, let $\mathcal{M}_R = \{A_1, \dots, A_k\}$ be the subset of $\mathcal{M}_{\mathcal{D}}$ containing all the attributes
in $R$ belonging to $\mathcal{M}_{\mathcal{D}}$.

Definition 2 (Atomic update). Let $t = R(v_1, \dots, v_n)$ be a tuple on the relational
scheme $R(A_1 : \Delta_1, \dots, A_n : \Delta_n)$. An atomic update on $t$ is a triplet $\langle t, A_i, v_i' \rangle$,
where $A_i \in \mathcal{M}_R$, $v_i'$ is a value in $\Delta_i$, and $v_i' \ne v_i$.

Update $u = \langle t, A_i, v_i' \rangle$ replaces $t[A_i]$ with $v_i'$, thus yielding the tuple
$u(t) = R(v_1, \dots, v_{i-1}, v_i', v_{i+1}, \dots, v_n)$.

Observe that atomic updates work on the set $\mathcal{M}_{\mathcal{D}}$ of measure attributes, as our
framework is based on the assumption that data inconsistency is due to errors in the 
acquisition phase (as in the case of digitization of paper documents) or in the mea- 
surement phase (as in the case of sensor readings). Therefore our approach will only 
consider repairs aiming at re-constructing the correct measures. 

Example 5. Update $u = \langle t, Value, 130 \rangle$ issued on tuple t = CashBudget(2003,
Receipts, cash sales, det, 100) returns u(t) = CashBudget(2003, Receipts, cash sales,
det, 130).

Given an update $u$, we denote the attribute updated by $u$ as $\lambda(u)$. That is, if
$u = \langle t, A_i, v \rangle$ then $\lambda(u) = \langle t, A_i \rangle$.

Definition 3 (Consistent database update). Let $D$ be a database and $U = \{u_1, \dots, u_n\}$
be a set of atomic updates on tuples of $D$. The set $U$ is said to be a consistent database
update iff $\forall j, k \in [1..n]$, if $j \ne k$ then $\lambda(u_j) \ne \lambda(u_k)$.

Informally, a set of atomic updates $U$ is a consistent database update iff for each
pair of updates $u_1, u_2 \in U$, $u_1$ and $u_2$ do not work on the same tuple, or they change
different attributes of the same tuple.

The set of pairs $\langle$tuple, attribute$\rangle$ updated by a consistent database update $U$ will
be denoted as $\lambda(U) = \bigcup_{u_i \in U} \lambda(u_i)$.

Given a database D and a consistent database update U, the result of performing U 
on D consists in the new database U{D) obtained by performing all atomic updates in 
U. 
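
Definitions 2 and 3 can be sketched in Python as follows. The representation of a database as a dictionary from tuple identifiers to attribute dictionaries, the names `AtomicUpdate`, `target`, `is_consistent_update` and `apply_updates`, and the constant `MEASURE_ATTRIBUTES` are all assumptions made for this illustration.

```python
from typing import Hashable, NamedTuple

MEASURE_ATTRIBUTES = {"Value"}  # assumption: measure attributes of the running example


class AtomicUpdate(NamedTuple):
    tuple_id: Hashable   # identifies the tuple t
    attribute: str       # the measure attribute A_i
    new_value: float     # the new value v_i'


def target(u):
    """lambda(u): the <tuple, attribute> pair changed by u."""
    return (u.tuple_id, u.attribute)


def is_consistent_update(updates):
    """True iff no two distinct updates change the same attribute of the same tuple."""
    distinct = set(updates)
    return len({target(u) for u in distinct}) == len(distinct)


def apply_updates(db, updates):
    """Return U(D) as a new database, leaving db untouched."""
    if not is_consistent_update(updates):
        raise ValueError("not a consistent database update")
    result = {tid: dict(attrs) for tid, attrs in db.items()}
    for u in set(updates):
        if u.attribute not in MEASURE_ATTRIBUTES:
            raise ValueError(f"{u.attribute} is not a measure attribute")
        if result[u.tuple_id][u.attribute] == u.new_value:
            raise ValueError("an atomic update must change the stored value")
        result[u.tuple_id][u.attribute] = u.new_value
    return result


# Example 5: update the Value of the 2003 cash sales tuple to 130.
db = {"t": {"Year": 2003, "Section": "Receipts", "Subsection": "cash sales",
            "Type": "det", "Value": 100}}
u = AtomicUpdate("t", "Value", 130)
updated = apply_updates(db, [u])
print(updated["t"]["Value"])
```

Two updates assigning different values to the same pair $\langle t, Value \rangle$ are rejected, which is precisely the situation Definition 3 rules out.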

3 Repairing inconsistent databases 

Definition 4 (Repair). Let $\mathcal{D}$ be a database scheme, $\mathcal{AC}$ a set of aggregate constraints
on $\mathcal{D}$, and $D$ an instance of $\mathcal{D}$ such that $D \not\models \mathcal{AC}$. A repair $\rho$ for $D$ is a consistent
database update such that $\rho(D) \models \mathcal{AC}$.

Example 6. A repair $\rho$ for CashBudget w.r.t. constraints 1), 2) and 3) consists in
decreasing attribute Value in the tuple t = CashBudget(2003, Receipts, total cash receipts,
aggr, 250) down to 220; that is, $\rho = \{ \langle t, Value, 220 \rangle \}$.

We now characterize the complexity of the repair-existence problem. All the com- 
plexity results in the paper refer to data-complexity, that is the size of the constraints is 
assumed to be bounded by a constant. 

The following lemma is a preliminary result which states that potential repairs for 
an inconsistent database can be found among sets of updates whose size is polynomially
bounded by the size of the original database. 

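Lemma 1. Let $\mathcal{D}$ be a database scheme, $\mathcal{AC}$ a set of aggregate constraints on $\mathcal{D}$, and
$D$ an instance of $\mathcal{D}$ such that $D \not\models \mathcal{AC}$. If there is a repair $\rho$ for $D$ w.r.t. $\mathcal{AC}$, then there
is a repair $\rho'$ for $D$ such that $\lambda(\rho') \subseteq \lambda(\rho)$ and $\rho'$ has polynomial size w.r.t. $D$.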

Proof. (sketch) W.l.o.g. we assume that the attribute expression $e_{\chi_i}$ occurring in each
aggregate function $\chi_i$ in $\mathcal{AC}$ is either an attribute or a constant. Let $\rho$ be a repair for $D$,
and $\mathcal{AC}^*$ be the set of inequalities obtained as follows:

1. a variable $x_{t,A}$ is associated to each pair $\langle t, A \rangle \in \lambda(\rho)$;

2. for every constraint in $\mathcal{AC}$ of the form (1) and for every ground substitution $\theta$ of
$x_1, \dots, x_k$ s.t. $\phi(\theta x_1, \dots, \theta x_k)$ is true, the following inequalities are added to $\mathcal{AC}^*$:

a. $\sum_{i=1}^{n} c_i \cdot \sum_{\langle t, e_{\chi_i} \rangle \in \lambda(\rho) \,\wedge\, t \models \alpha_i(\theta x_1, \dots, \theta x_k)} x_{t, e_{\chi_i}} \le K'$,
where $K'$ is $K$ minus the contribution to the left-hand side of the constraint due to values
which have not been changed by $\rho$, i.e.
$K' = K - \sum_{i=1}^{n} c_i \cdot \sum_{\langle t, e_{\chi_i} \rangle \notin \lambda(\rho) \,\wedge\, t \models \alpha_i(\theta x_1, \dots, \theta x_k)} t[e_{\chi_i}]$;

b. for each tuple $t$ such that $t \models \alpha_i(\theta x_1, \dots, \theta x_k)$, let $\alpha'$ be the disjunctive normal
form of $\alpha_i$ and let $\beta$ be a disjunct in $\alpha'$ such that $t \models \beta(\theta x_1, \dots, \theta x_k)$. For each
conjunct $\gamma$ in $\beta$ of the form $w_1 \diamond w_2$, where $\diamond$ is a comparison operator, and either
$w_1$ or $w_2$ is an attribute $A$ such that $\langle t, A \rangle \in \lambda(\rho)$, the constraint $v_1 \diamond v_2$ is added
to $\mathcal{AC}^*$, where, for $j \in \{1, 2\}$: if $w_j$ is a constant, $v_j = w_j$; if $w_j = A$ and
$\langle t, A \rangle \in \lambda(\rho)$, $v_j = x_{t,A}$; if $w_j = A$ and $\langle t, A \rangle \notin \lambda(\rho)$, $v_j = t[A]$.

Obviously $\mathcal{AC}^*$ has at least one solution, which corresponds to assigning to each variable
$x_{t,A_i}$ the value assigned by $\rho$ to attribute $A_i$ of tuple $t$. Moreover, the number of
variables and inequalities, and the size of constants in $\mathcal{AC}^*$ are polynomially bounded by the
size of $D$. Therefore there is a solution $X$ to $\mathcal{AC}^*$ whose size is polynomially bounded
by the size of $D$, since $\mathcal{AC}^*$ is an integer linear programming problem with at least one
solution [14]. $X$ defines a repair $\rho'$ for $D$ such that $\lambda(\rho') \subseteq \lambda(\rho)$ and $\rho'$ has
polynomial size w.r.t. $D$. □

Theorem 1 (Repair existence). Let $\mathcal{D}$ be a database scheme, $\mathcal{AC}$ a set of aggregate
constraints on $\mathcal{D}$, and $D$ an instance of $\mathcal{D}$ such that $D \not\models \mathcal{AC}$. The problem of deciding
whether there is a repair for $D$ is NP-complete.

Proof. Membership. A polynomial size witness for deciding the existence of a repair
is a database update $U$ on $D$: testing whether $U$ is a repair for $D$ means verifying
$U(D) \models \mathcal{AC}$, which can be accomplished in polynomial time w.r.t. the size of $D$ and
$U$. If a repair exists for $D$, then Lemma 1 guarantees that a polynomial size repair for
$D$ exists too.

Hardness. We show a reduction from CIRCUIT SAT to our problem. Without loss of
generality, we consider a boolean circuit $C$ using only NOR gates. The inputs of $C$ will
be denoted as $x_1, \dots, x_n$. The boolean circuit $C$ can be represented by means of the
database scheme:

gate(IDGate, norVal, orVal),
gateInput(IDGate, IDIngoing, Val),
input(IDInput, Val).

Therein:

1. each gate in $C$ corresponds to a tuple in gate (attributes norVal and orVal represent
the output of the corresponding NOR gate and its negation, respectively);

2. inputs of $C$ correspond to tuples of input: attribute Val in a tuple of input represents
the truth assignment to the input $x_{IDInput}$;

3. each tuple in gateInput represents an input of the gate identified by IDGate. In
particular, IDIngoing refers to either a gate identifier or an input identifier; attribute
Val is a copy of the truth value of the specified ingoing gate or input.

We consider the database instance $D$ where the relations defined above are populated
as follows. For each input $x_i$ in $C$ we insert the tuple $input(id(x_i), -1)$ into
$D$, and for each gate $g$ in $C$ we insert the tuple $gate(id(g), -1, -1)$, where function
$id(x)$ assigns a unique identifier to its argument (we assume that gate identifiers are
distinct from input identifiers, and that the output gate of $C$ is assigned the identifier
0). Moreover, for each edge in $C$ going from $g'$ to the gate $g$ (where $g'$ is either a gate
or an input of $C$), the tuple $gateInput(id(g), id(g'), -1)$ is inserted into $D$. Assume
that $\mathcal{M}_{gate} = \{norVal, orVal\}$, $\mathcal{M}_{gateInput} = \{Val\}$, $\mathcal{M}_{input} = \{Val\}$. In the
following, we will define aggregate constraints to force measure attributes of all tuples
to be assigned either 1 or 0, representing the truth values true and false, respectively.
The initial assignment (where every measure attribute is set to $-1$) means that the truth
values of inputs and gate outputs are undefined.

Consider the following aggregation functions: 

NORVal(X) = SELECT Sum(norVal)
            FROM gate
            WHERE (IDGate = X)

ORVal(X) = SELECT Sum(orVal)
           FROM gate
           WHERE (IDGate = X)

IngoingVal(X, Y) = SELECT Sum(Val)
                   FROM gateInput
                   WHERE (IDGate = X)
                     AND (IDIngoing = Y)

IngoingSum(X) = SELECT Sum(Val)
                FROM gateInput
                WHERE (IDGate = X)

InputVal(X) = SELECT Sum(Val)
              FROM input
              WHERE (IDInput = X)

ValidInput() = SELECT Sum(1)
               FROM input
               WHERE (Val ≠ 0)
                 AND (Val ≠ 1)

ValidGate() = SELECT Sum(1)
              FROM gate
              WHERE (orVal ≠ 0 AND orVal ≠ 1)
                 OR (norVal ≠ 0 AND norVal ≠ 1)

Therein: NORVal(X) and ORVal(X) return the truth value of the gate X and its
opposite, respectively; IngoingVal(X, Y) returns, for the gate with identifier X, the
truth value of the ingoing gate or input having identifier Y; IngoingSum(X) returns
the sum of the truth values of the inputs of the gate X; InputVal(X) returns the truth
assignment of the input X; ValidInput() returns 0 iff there is no tuple in relation
input where attribute Val is neither 0 nor 1, otherwise it returns a number greater than
0; likewise, ValidGate() returns 0 iff there is no tuple in relation gate where attributes
norVal or orVal are neither 0 nor 1 (otherwise it returns a number greater than 0).

Consider the following aggregate constraints on $\mathcal{D}$:

1. $ValidInput() + ValidGate() = 0$, which entails that only 0 and 1 can be assigned
both to attributes orVal and norVal in relation gate, and to attribute Val in
relation input;

2. $gate(X, \_, \_) \Longrightarrow ORVal(X) + NORVal(X) = 1$, which says that for each tuple
representing a NOR gate, the value of orVal must be complementary to norVal;

3. $gate(X, \_, \_) \Longrightarrow ORVal(X) - IngoingSum(X) \le 0$, which says that for each
tuple representing a NOR gate, the value of orVal cannot be greater than the sum
of truth assignments of its inputs (i.e. if all inputs are 0, orVal must be 0 too);

4. $gateInput(X, Y, \_) \Longrightarrow IngoingVal(X, Y) - ORVal(X) \le 0$, which implies that,
for each gate $g$, attribute orVal must be 1 if at least one input of $g$ has value 1;

5. $gateInput(X, Y, \_) \Longrightarrow IngoingVal(X, Y) - NORVal(Y) - InputVal(Y) = 0$,
which imposes that the attribute Val in each tuple of gateInput is the same as the
truth value of either the ingoing gate or the ingoing input.

Observe that D does not satisfy these constraints, but every repair of D corresponds 
to a valid truth assignment of C. 

Let $\mathcal{AC}$ be the set of aggregate constraints consisting of constraints 1-5 defined
above plus constraint $NORVal(0) = 1$ (which imposes that the truth value of the output
gate must be true). Therefore, deciding whether there is a truth assignment which
evaluates $C$ to true is equivalent to asking whether there is a repair $\rho$ for $D$ w.r.t.
$\mathcal{AC}$. □
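
The encoding used in the hardness proof can be explored on small circuits with the following Python sketch. It is an illustration under stated assumptions, not the construction itself: relations are stored as dictionaries, the names `build_instance`, `satisfies`, `repaired_instance` and `find_repair` are hypothetical, sums over an empty set of tuples are taken as 0, and candidate repairs are enumerated by brute force over input assignments (exponential in the number of inputs, in line with the NP-hardness result).

```python
from itertools import product


def build_instance(inputs, gates):
    """Initial instance D: every measure attribute is set to -1.
    `gates` maps a gate id to the ids of its ingoing gates or inputs; the output gate has id 0."""
    return {
        "input": {i: -1 for i in inputs},                          # input(IDInput, Val)
        "gate": {g: {"norVal": -1, "orVal": -1} for g in gates},   # gate(IDGate, norVal, orVal)
        "gateInput": {(g, src): -1 for g, srcs in gates.items() for src in srcs},
    }


def satisfies(db, output_gate=0):
    """Check constraints 1-5 plus NORVal(output_gate) = 1."""
    inp, gate, gin = db["input"], db["gate"], db["gateInput"]
    # Constraint 1: only 0 and 1 are allowed in input.Val, gate.norVal and gate.orVal.
    if any(v not in (0, 1) for v in inp.values()):
        return False
    if any(g["norVal"] not in (0, 1) or g["orVal"] not in (0, 1) for g in gate.values()):
        return False
    for g, vals in gate.items():
        ingoing_sum = sum(v for (gid, _src), v in gin.items() if gid == g)
        if vals["orVal"] + vals["norVal"] != 1:     # constraint 2
            return False
        if vals["orVal"] - ingoing_sum > 0:         # constraint 3
            return False
    for (g, src), v in gin.items():
        if v - gate[g]["orVal"] > 0:                # constraint 4
            return False
        # Constraint 5: Val equals NORVal(src) + InputVal(src); exactly one of them is non-empty.
        if v != gate.get(src, {}).get("norVal", 0) + inp.get(src, 0):
            return False
    return gate[output_gate]["norVal"] == 1


def repaired_instance(gates, assignment):
    """Database obtained by fixing the inputs and propagating NOR values through the circuit."""
    nor = {}

    def value(node):
        if node in assignment:
            return assignment[node]
        if node not in nor:
            nor[node] = int(not any(value(src) for src in gates[node]))
        return nor[node]

    return {
        "input": dict(assignment),
        "gate": {g: {"norVal": value(g), "orVal": 1 - value(g)} for g in gates},
        "gateInput": {(g, src): value(src) for g, srcs in gates.items() for src in srcs},
    }


def find_repair(inputs, gates):
    """Return a repaired instance satisfying all constraints, or None if the circuit is unsatisfiable."""
    for bits in product((0, 1), repeat=len(inputs)):
        candidate = repaired_instance(gates, dict(zip(inputs, bits)))
        if satisfies(candidate):
            return candidate
    return None


# Gate 1 = NOR(x1), gate 2 = NOR(x2), output gate 0 = NOR(gate 1, gate 2), i.e. x1 AND x2.
inputs = ["x1", "x2"]
gates = {0: [1, 2], 1: ["x1"], 2: ["x2"]}
repair = find_repair(inputs, gates)
print(repair["input"])
```

The initial instance built by `build_instance` violates the constraints, whereas the instance returned by `find_repair` encodes a satisfying truth assignment of the circuit.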

Remark. Theorem 1 states that the repair existence problem is decidable. This result, 
together with the practical usefulness of the considered class of constraints, makes the 
complexity analysis of finding consistent answers on inconsistent data interesting. Ba- 
sically decidability results from the linear nature of the considered constraints. If prod- 
ucts between two attributes were allowed as attribute expressions, the repair-existence 
problem would be undecidable (this can be proved straightforwardly, since this form 
of non-linear constraints is more expressive than those introduced in [5], where the 
corresponding repair-existence problem was shown to be undecidable). However, observe
that occurrences of products of the form $A_i \times A_j$ in attribute expressions can lead
to undecidability only if both $A_i$ and $A_j$ are measure attributes. Otherwise, this case
is equivalent to products of the form $c \times A$, which can be expressed in our form of
aggregate constraints.

3.1 Minimal repairs 

Theorem 1 deals with the problem of deciding whether a database D violating a set of 
aggregate constraints AC can be repaired. If this is the case, different repairs can be 
performed on D yielding a new database consistent w.r.t. AC, although not all of them 
can be considered "reasonable". For instance, if a repair exists for D changing only 
one value in one tuple of D, any repair updating all values in all tuples of D can be 
reasonably disregarded. To evaluate whether a repair should be considered "relevant"
or not, we introduce two different ordering criteria on repairs, corresponding to the
comparison operators '$<_{set}$' and '$<_{card}$'. The former compares two repairs by evaluating
whether one of the two performs a subset of the updates of the other. That is, given two
repairs $\rho_1$, $\rho_2$, we say that $\rho_1$ precedes $\rho_2$ ($\rho_1 <_{set} \rho_2$) iff $\lambda(\rho_1) \subset \lambda(\rho_2)$. The latter
ordering criterion states that a repair $\rho_1$ is preferred w.r.t. a repair $\rho_2$ ($\rho_1 <_{card} \rho_2$) iff
$|\lambda(\rho_1)| < |\lambda(\rho_2)|$, that is, if the number of changes issued by $\rho_1$ is less than that of $\rho_2$.
Observe that $\rho_1 <_{set} \rho_2$ implies $\rho_1 <_{card} \rho_2$, but the converse does not hold, as it can
be the case that repair $\rho_1$ changes a set of values $\lambda(\rho_1)$ which is not a subset of $\lambda(\rho_2)$,
but has cardinality less than $\lambda(\rho_2)$.

Example 7. Another repair for CashBudget is
$\rho' = \{\langle t_1, Value, 130 \rangle, \langle t_2, Value, 70 \rangle, \langle t_3, Value, 190 \rangle\}$,
where $t_1$ = CashBudget(2003, Receipts, cash sales, det, 100), $t_2$ =
CashBudget(2003, Disbursements, long-term financing, det, 40), and $t_3$ = CashBudget(2003,
Disbursements, total disbursements, aggr, 160).

Observe that $\rho <_{card} \rho'$, but not $\rho <_{set} \rho'$ (where $\rho$ is the repair defined in Example 6).

Definition 5 (Minimal repairs). Let $\mathcal{D}$ be a database scheme, $\mathcal{AC}$ a set of aggregate
constraints on $\mathcal{D}$, and $D$ an instance of $\mathcal{D}$. A repair $\rho$ for $D$ w.r.t. $\mathcal{AC}$ is a set-minimal
repair [resp. card-minimal repair] iff there is no repair $\rho'$ for $D$ w.r.t. $\mathcal{AC}$ such that
$\rho' <_{set} \rho$ [resp. $\rho' <_{card} \rho$].

Example 8. Repair $\rho$ of Example 6 is minimal under both the set-minimal and the
card-minimal semantics, whereas $\rho'$ defined in Example 7 is minimal only under the
set-minimal semantics.

Consider the repair $\rho''$ consisting of the following updates:
$\rho'' = \{\langle t_1, Value, 110 \rangle, \langle t_2, Value, 110 \rangle, \langle t_3, Value, 220 \rangle\}$,
where: $t_1$ = CashBudget(2003, Receipts, cash sales, det, 100), $t_2$ = CashBudget(2003,
Receipts, receivables, det, 120), $t_3$ = CashBudget(2003, Receipts, total cash receipts,
aggr, 250).

The strategy adopted by $\rho''$ can be reasonably disregarded, since the only atomic update
on tuple $t_3$ suffices to make $D$ consistent. In fact, $\rho''$ is minimal neither under the
set-minimal semantics (as $\lambda(\rho) \subset \lambda(\rho'')$ and thus $\rho <_{set} \rho''$) nor under the
card-minimal one.
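
The comparisons made in Examples 6-8 can be replayed with a short Python sketch. It rests on several assumptions made for illustration: only the year-2003 values are modelled, a repair is a dictionary from subsection to new Value (so its keys stand for $\lambda(\rho)$), and the names `is_consistent`, `apply_repair`, `precedes_set` and `precedes_card` are hypothetical. The values 20 for beginning cash and 0 for capital expenditure are inferred from the constraints.

```python
# Year-2003 values of Table 1, keyed by subsection (beginning cash and capital
# expenditure are inferred values, see the lead-in).
D_2003 = {
    "beginning cash": 20, "cash sales": 100, "receivables": 120,
    "total cash receipts": 250, "payment of accounts": 120,
    "capital expenditure": 0, "long-term financing": 40,
    "total disbursements": 160, "net cash inflow": 60,
    "ending cash balance": 80,
}


def is_consistent(v):
    """Constraints 1-3 of Example 1 restricted to a single year."""
    return (
        v["cash sales"] + v["receivables"] == v["total cash receipts"]
        and v["payment of accounts"] + v["capital expenditure"]
        + v["long-term financing"] == v["total disbursements"]
        and v["net cash inflow"] == v["total cash receipts"] - v["total disbursements"]
        and v["ending cash balance"] == v["beginning cash"] + v["net cash inflow"]
    )


def apply_repair(repair, db):
    """Return the database obtained by performing all updates of the repair."""
    return {**db, **repair}


def precedes_set(r1, r2):
    """r1 <_set r2: the cells updated by r1 are a proper subset of those updated by r2."""
    return set(r1) < set(r2)


def precedes_card(r1, r2):
    """r1 <_card r2: r1 updates fewer cells than r2."""
    return len(r1) < len(r2)


rho = {"total cash receipts": 220}                                            # Example 6
rho1 = {"cash sales": 130, "long-term financing": 70, "total disbursements": 190}  # Example 7
rho2 = {"cash sales": 110, "receivables": 110, "total cash receipts": 220}    # Example 8

for name, r in (("rho", rho), ("rho'", rho1), ("rho''", rho2)):
    print(name, is_consistent(apply_repair(r, D_2003)))
print(precedes_card(rho, rho1), precedes_set(rho, rho1), precedes_set(rho, rho2))
```

All three update sets restore consistency, but only the comparison with $\rho''$ holds under $<_{set}$, which is why $\rho'$ remains set-minimal while $\rho''$ does not.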

Given a database $D$ which is not consistent w.r.t. a set of aggregate constraints
$\mathcal{AC}$, different set-minimal repairs (resp. card-minimal repairs) can exist on $D$. In our
running example, repair $\rho$ of Example 6 is the unique card-minimal repair, and both $\rho$
and $\rho'$ are set-minimal repairs (where $\rho'$ is the repair defined in Example 7). The set of
set-minimal repairs and the set of card-minimal repairs will be denoted, respectively,
as $\rho_M^{set}$ and $\rho_M^{card}$.

Theorem 2 (Minimal-repair checking). Let $\mathcal{D}$ be a database scheme, $\mathcal{AC}$ a set of
aggregate constraints on $\mathcal{D}$, and $D$ be an instance of $\mathcal{D}$ such that $D \not\models \mathcal{AC}$. Given a
repair $\rho$ for $D$ w.r.t. $\mathcal{AC}$, deciding whether $\rho$ is minimal (under both the card-minimality
and set-minimality semantics) is coNP-complete.

Proof. (Membership) A polynomial size witness for the complement of the problem
of deciding whether $\rho \in \rho_M^{set}$ [resp. $\rho \in \rho_M^{card}$] is a repair $\rho'$ such that $\rho' <_{set} \rho$ [resp.
$\rho' <_{card} \rho$]. From Lemma 1 we have that $\rho'$ can be found among repairs having
polynomial size w.r.t. $D$.

(Hardness) We show a reduction of MINIMAL MODEL CHECKING (MMC) [7] to our
problem. Consider an instance ⟨f, M⟩ of MMC, where f is a propositional formula
and M a model for f. Formula f can be translated into an equivalent boolean circuit
C using only NOR gates, and C can be represented as shown in the hardness
proof of Theorem 1. Therefore, we consider the same database scheme 𝒟 and the same
set of aggregate constraints AC on 𝒟 as those in the proof of Theorem 1. Let D be
the instance of 𝒟 constructed as follows. For each input x_i in C we insert the tuple
input(id(x_i), 0) into D. Then, as for the construction in the hardness proof of
Theorem 1, for each gate g in C we insert the tuple gate(id(g), -1, -1) into D, and for
each edge in C going from g' to the gate g (where g' is either a gate or an input of C),
the tuple gateInput(id(g), id(g'), -1) is inserted into D.
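
The construction of D described above can be sketched as follows. This is a minimal
illustration, not the authors' code: the circuit encoding (a list of input names and a
dictionary mapping each gate to its sources), the function build_instance and the
integer identifiers standing in for id(.) are all assumptions made for the example.

```python
def build_instance(inputs, gates):
    """Build the relations input, gate and gateInput for a NOR circuit.

    inputs: list of input names.
    gates:  dict mapping each gate name to the list of its sources
            (each source is an input or another gate).
    """
    # Hypothetical id() function: consecutive integers.
    ident = {name: i for i, name in enumerate(list(inputs) + list(gates))}
    db = {"input": [], "gate": [], "gateInput": []}
    for x in inputs:
        db["input"].append((ident[x], 0))            # every input starts at 0
    for g, sources in gates.items():
        db["gate"].append((ident[g], -1, -1))        # both measures undefined
        for src in sources:
            db["gateInput"].append((ident[g], ident[src], -1))
    return db

# Toy circuit: g1 = NOR(x1, x2), g2 = NOR(g1, x1).
circuit_inputs = ["x1", "x2"]
circuit_gates = {"g1": ["x1", "x2"], "g2": ["g1", "x1"]}
db = build_instance(circuit_inputs, circuit_gates)
```

Every value -1 in the resulting instance is a measure attribute that any repair has to
update, which is the observation the proof relies on next.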

Observe that any repair for D must update all measure attributes in D with value -1.
Therefore, given two repairs ρ', ρ'', it holds that for each ⟨t, A⟩ ∈ (λ(ρ') Δ λ(ρ'')),
t is a tuple of input and A = Val.

Obviously, a repair ρ for D exists, consisting of the following updates: 1) attribute
Val is assigned 1 in every tuple of input corresponding to an atom in f which is true in
M; 2) attributes norVal, orVal in gate and Val in gateInput are updated accordingly
to updates described above. Basically, such a constructed repair ρ corresponds to M
(we say that a repair corresponds to a model if it assigns 1 to attribute Val in the tuples
of input corresponding to the atoms which are true in the model, and 0 otherwise).

If M is not a minimal model for f, then there exists a model M' such that M' ⊂ M
(i.e. atoms which are true in M' are a proper subset of atoms which are true in M).
Then, the repair ρ' corresponding to M' satisfies ρ' <_set ρ. Vice versa, if there exists a
repair ρ' such that ρ' <_set ρ, then the model M' corresponding to ρ' is a proper subset
of M, thus M is not minimal. This proves that M is a minimal model for f iff ρ is a
minimal repair (under set-minimal semantics) for D w.r.t. AC.

Proving hardness under card-minimal semantics can be accomplished as follows.
First, a formula f_M is constructed from f by replacing, for each atom a ∉ M, each
occurrence of a in f with the contradiction (a ∧ ¬a). Then, an instance D of 𝒟 is
constructed corresponding to formula f_M with the same value assignments as before
(attribute Val in all the tuples of input is set to 0, and all the other measure attributes
are set to -1).

M is a model for both f and f_M, and it is minimal for f iff it is minimum for
f_M. In fact, if M is minimal for f there is no proper subset M' of M which is a model
of f. Then, assume that a model M'' for f_M exists, such that |M''| < |M|. Then, also
M''' = M'' ∩ M is a model for f_M, implying that M''' is a model for f, which is a
contradiction (as M''' ⊂ M). On the other hand, if M is minimum for f_M then M
must be minimal for f. Otherwise, there would exist a model M' for f s.t. M' ⊂ M.
However M' is also a model for f_M, which is a contradiction, as |M'| < |M|.
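
The equivalence between minimality for f and minimum cardinality for f_M can be checked
by brute force on small formulas. The sketch below is an illustration under stated
assumptions: formulas are Python predicates over the set of true atoms, and f_M is
obtained by evaluating f on the interpretation restricted to M, which has the same
effect as replacing every atom outside M with the contradiction (a ∧ ¬a). All function
names are hypothetical.

```python
from itertools import combinations

def subsets(atoms):
    """Enumerate all interpretations (sets of true atoms) over the given atoms."""
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def models(f, atoms):
    return [m for m in subsets(atoms) if f(m)]

def restrict(f, M):
    """f_M: atoms outside M behave as false, like the contradiction (a AND NOT a)."""
    return lambda interp: f(frozenset(interp) & M)

def is_minimal(f, M, atoms):
    """M is a model of f and no proper subset of M is a model of f."""
    return f(M) and not any(m < M for m in models(f, atoms))

def is_minimum(f, M, atoms):
    """M is a model of f and no model of f has fewer true atoms."""
    return f(M) and not any(len(m) < len(M) for m in models(f, atoms))

atoms = {"a", "b", "c"}
f = lambda i: ("a" in i and "b" in i) or ("c" in i)   # (a AND b) OR c
M = frozenset({"a", "b"})
f_M = restrict(f, M)

print(is_minimal(f, M, atoms), is_minimum(f, M, atoms), is_minimum(f_M, M, atoms))
```

In this toy case M = {a, b} is minimal but not minimum for f (the model {c} is smaller),
while it is minimum for f_M, as the argument above predicts.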

Let ρ be the repair of D w.r.t. AC corresponding to M. If M is not minimum, then
there exists M' (with |M'| < |M|) which is a model for f_M. Therefore the repair ρ'
corresponding to M' satisfies ρ' <_card ρ. Vice versa, if a repair ρ' for D w.r.t. AC exists
such that ρ' <_card ρ, then the model M' corresponding to ρ' is such that |M'| < |M|,
thus M is not minimum for f_M. This proves that M is a minimal model for f iff there
is no repair ρ' for D w.r.t. AC such that ρ' <_card ρ. □

Set-minimality vs card-minimality 

Basically, both the set-minimal and the card-minimal semantics aim at considering
"reasonable" repairs which preserve the content of the input database as much as
possible. To the best of our knowledge the notion of repair minimality based on the
number of performed updates has not been used in the context of relational data violating
"non-numerical" constraints (such as keys, foreign keys, and functional dependencies).
In this context, most of the proposed approaches consider repairs consisting of
deletions and insertions of tuples, and preferred repairs are those consisting of minimal
sets of insert/delete operations. In fact, the set-minimal semantics is more natural than
the card-minimal one when no hypothesis can be reasonably formulated to "guess" how
data inconsistency occurred, which is the case of previous works on database repairing.
As will be clear in the following, in the general case, the adoption of the card-minimal
semantics could cause reasonable sets of delete/insert operations not to be considered
as candidate repairs, even if they correspond to error configurations which cannot be
excluded.

For instance, consider a relational scheme Department(Name, Area, Employers,
Category) where the following functional dependencies are defined: FD1 : Area →
Employers (i.e. departments having the same area must have the same number of
employers) and FD2 : Employers → Category (i.e. departments with the same number
of employers must be of the same category). Consider the following relation:



  Department    Area    Employers    Category
  D1            100     24           A
  D2            100     30           B
  D3            100     30           B

The relation above does not satisfy FD1, as the three departments occupy the same
area but do not have the same number of employers (in the following, t1, t2 and t3
denote the tuples of departments D1, D2 and D3, respectively). Suppose we are using a
repairing strategy based on deletions and insertions of tuples. Different repairs can be
adopted. For instance, if we suppose that the inconsistency arises as tuple t1 contains
wrong information, Department can be repaired by only deleting t1. Otherwise, if we
assume that t1 is correct, a possible repair consists of deleting t2 and t3. If the
card-minimal semantics is adopted, the latter strategy will be disregarded, as it performs
two deletions, whereas the former deletes only one tuple. On the contrary, if the
set-minimal semantics is adopted, both strategies define minimal repairs (as the sets of
tuples deleted by each of these strategies are not subsets of one another). In fact, if we
do not know how the error occurred, there is no reason to assume that the error
configuration corresponding to the second repairing strategy is not possible. Indeed,
inconsistency could be due to integrating data coming from different sources, where some
sources are not up-to-date. However, there is no good reason to assume that the source
which contains the smallest number of tuples is the one that is up to date. See [13] for
a survey on inconsistency due to data integration.
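
The difference between the two semantics on this relation can be reproduced with a small
brute-force sketch. It is only an illustration under simplifying assumptions: repairs are
restricted to tuple deletions (insertions are ignored), the identifiers t1, t2, t3 are
taken to denote the tuples of D1, D2, D3, and all function names are hypothetical.

```python
from itertools import combinations

# Columns: Name, Area, Employers, Category.
relation = {
    "t1": ("D1", 100, 24, "A"),
    "t2": ("D2", 100, 30, "B"),
    "t3": ("D3", 100, 30, "B"),
}

def satisfies_fd(rows, lhs, rhs):
    """Check the functional dependency column lhs -> column rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

def consistent(rows):
    # FD1: Area -> Employers (1 -> 2); FD2: Employers -> Category (2 -> 3).
    return satisfies_fd(rows, 1, 2) and satisfies_fd(rows, 2, 3)

def deletion_repairs(rel):
    """All sets of deleted tuple ids that leave a consistent relation."""
    ids = sorted(rel)
    found = []
    for k in range(len(ids) + 1):
        for deleted in combinations(ids, k):
            kept = [rel[t] for t in ids if t not in deleted]
            if consistent(kept):
                found.append(frozenset(deleted))
    return found

repairs = deletion_repairs(relation)
set_minimal = [r for r in repairs if not any(other < r for other in repairs)]
smallest = min(len(r) for r in repairs)
card_minimal = [r for r in repairs if len(r) == smallest]

print(sorted(sorted(r) for r in set_minimal))   # deletions kept under set-minimality
print(sorted(sorted(r) for r in card_minimal))  # deletions kept under card-minimality
```

With these inputs the set-minimal deletions are {t1} and {t2, t3}, while only {t1} is
card-minimal, in line with the discussion above.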

Likewise, the card-minimal semantics could disregard reasonable repairs also in
the case that a repairing strategy based on updating values instead of deleting/inserting
whole tuples is adopted (1). For instance, if we suppose that the inconsistency arises as
the value of attribute Area is wrong for either t1 or both t2 and t3, Department can be
repaired by replacing the Area value for either t1 or both t2 and t3 with a value different
from 100. Otherwise, if we assume that the Area values for all the tuples are correct,
Department can be repaired w.r.t. FD1 by making the Employers value of t1 equal to
that of t2 and t3. Indeed this update yields a relation which does not satisfy FD2 (as
t1[Category] ≠ t2[Category]) so that another value update is necessary in order to make
it consistent. Under the card-minimal semantics the latter strategy is disregarded, as it
performs more than one value update, whereas the former changes only the Area value
of one tuple. On the contrary, under the set-minimal semantics both strategies
define minimal repairs (as the sets of updates issued by each of these strategies are
not subsets of one another). As for the case explained above, disregarding the second
repairing strategy is arbitrary, if we do not know how the error occurred.

(1) Value updates cannot necessarily be simulated as a sequence deletion/insertion, as
this might not be minimal under set inclusion.

Our framework addresses scenarios where also card-minimal semantics can be
reasonable. For instance, if we assume that integrity violations are generated while
acquiring data by means of an automatic or semi-automatic system (e.g. an OCR digitizing
a paper document, a sensor monitoring atmospheric conditions, etc.), focusing on
error configurations which can be repaired with the minimum number of updates is well
founded. Indeed this corresponds to the case that the acquiring system made the
minimum number of errors (e.g. bad symbol-recognition for an OCR, sensor troubles, etc.),
which can be considered the most probable event.

In this work we discuss the existence of repairs, and their computation under both
card-minimal and set-minimal semantics. The latter has to be preferred when no
warranty is given on the accuracy of acquiring tools, and, more generally, when no
hypothesis can be formulated on the cause of errors.

3.2 Consistent query answers 

In this section we address the problem of extracting reliable information from data 
violating a given set of aggregate constraints. We consider boolean queries checking 
whether a given tuple belongs to a database, and adopt the widely-used notion of con- 
sistent query answer introduced in [1]. 

Definition 6 (Query). A query over a database scheme 𝒟 is a ground atom of the form
R(v_1, ..., v_n), where R(A_1, ..., A_n) is a relational scheme in 𝒟.

Definition 7 (Consistent query answer). Let 𝒟 be a database scheme, D be an
instance of 𝒟, AC be a set of aggregate constraints on 𝒟 and q be a query over 𝒟. The
consistent query answer of q on D under the set-minimal semantics [resp. card-minimal
semantics] is true iff q ∈ ρ(D) for each ρ ∈ ρ_M^set [resp. for each ρ ∈ ρ_M^card].

The consistent query answers of a query q issued on the database D under the
set-minimal and card-minimal semantics will be denoted as q^set(D) and q^card(D),
respectively.
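
Definition 7 can be read operationally as follows. The sketch below is a hedged
illustration, not a decision procedure: it does not compute or verify minimal repairs
but takes them as given inputs. The attribute names other than Value, the helper
functions, and the two-update repair rho_alt (a hypothetical stand-in for a second
set-minimal repair, not claimed to be ρ' of Example 7) are assumptions.

```python
def apply_repair(db, repair):
    """Return a copy of db with every (tuple id, attribute, value) update applied."""
    repaired = {tid: dict(attrs) for tid, attrs in db.items()}
    for tid, attr, value in repair:
        repaired[tid][attr] = value
    return repaired

def holds(query, db):
    """A query is a ground tuple; it is true iff db contains exactly that tuple."""
    return any(tuple(attrs.values()) == query for attrs in db.values())

def consistent_answer(query, db, minimal_repairs):
    """True iff the query holds in the repaired database for every given repair."""
    return all(holds(query, apply_repair(db, r)) for r in minimal_repairs)

# Three CashBudget tuples of the running example (attribute names assumed).
db = {
    "t1": {"Year": 2003, "Section": "Receipts", "Subsection": "cash sales",
           "Type": "det", "Value": 100},
    "t2": {"Year": 2003, "Section": "Receipts", "Subsection": "receivables",
           "Type": "det", "Value": 120},
    "t3": {"Year": 2003, "Section": "Receipts", "Subsection": "total cash receipts",
           "Type": "aggr", "Value": 250},
}

rho = {("t3", "Value", 220)}                              # assumed repair of Example 6
rho_alt = {("t1", "Value", 125), ("t2", "Value", 125)}    # hypothetical second repair

q_total = (2003, "Receipts", "total cash receipts", "aggr", 220)

# Assumed inputs: both repairs are set-minimal, only rho is card-minimal.
answer_set = consistent_answer(q_total, db, [rho, rho_alt])
answer_card = consistent_answer(q_total, db, [rho])
print(answer_set, answer_card)
```

Under these assumptions the same query has consistent answer false under the set-minimal
semantics and true under the card-minimal one, since the latter quantifies over fewer
repairs.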

Theorem 3 (Consistent query answer under set-minimal semantics). Let 𝒟 be a
database scheme, D be an instance of 𝒟, AC be a set of aggregate constraints on 𝒟
and q be a query over 𝒟. Deciding whether q^set(D) = true is Π_2^p-complete.

Proof. See appendix. □

Theorem 4 (Consistent query answer under card-minimal semantics). Let 𝒟 be a
database scheme, D be an instance of 𝒟, AC be a set of aggregate constraints on 𝒟
and q be a query over 𝒟. Deciding whether q^card(D) = true is Δ_2^p[log n]-complete.

Proof. See appendix. □

Conclusions and Future Work

We have addressed the problem of repairing and extracting reliable information from
numerical databases violating aggregate constraints, thus filling a gap in previous works
dealing with inconsistent data, where only traditional forms of constraints (such as
keys, foreign keys, etc.) were considered. In fact, aggregate constraints frequently
occur in many real-life scenarios where guaranteeing the consistency of numerical data
is mandatory. In particular, we have considered aggregate constraints defined as sets of
linear inequalities on aggregate-sum queries on input data. For this class of constraints
we have characterized the complexity of several issues related to the computation of
consistent query answers.

Future work will be devoted to the identification of decidable cases when more
expressive forms of constraint are adopted, that allow products between attribute values
(as explained in the paper, enabling non-linear forms of aggregate expressions makes
the repair-existence problem undecidable in the general case). Moreover the design of
efficient algorithms for computing consistent answers will be addressed.

Appendix: Proofs of theorems

Theorem 3. Let 𝒟 be a database scheme, D be an instance of 𝒟, AC be a set of
aggregate constraints on 𝒟 and q be a query over 𝒟. Deciding whether q^set(D) = true
is Π_2^p-complete.

Proof. (Membership) Membership in Π_2^p can be proved by reasoning as for Theorem 1,
by exploiting a result similar to that of Lemma 1 (it can be proved that if there is a repair
ρ s.t. q(ρ(D)) is true, then there is a repair ρ' having polynomial size w.r.t. q and D s.t.
λ(ρ') ⊆ λ(ρ)).

(Hardness) Hardness can be proved by showing a reduction from the following
implication problem in the context of propositional logic over a finite domain V, which was
shown to be Π_2^p-complete in [10]: "given an atomic knowledge base T = {a_1, ..., a_n},
where a_1, ..., a_n are atoms of V, an atom Q ∈ T and a formula p on V, decide whether
Q is derivable from every model in T ∘_S p", where T ∘_S p is the updated (or revised)
knowledge base according to Satoh's revision operator.

Informally, Satoh's revision operator ∘_S selects the models of p that are "closest" to
models of T: closest models are those whose symmetric difference with models of T is
minimal under set-inclusion semantics. In order to define formally the semantics of ∘_S
we first introduce some preliminaries. Let Mod(p) be the set of models of a formula
p. Let Δ^min(T, p) = min_⊆({M Δ M' : M ∈ Mod(p), M' ∈ Mod(T)}), that is the
family of ⊆-minimal sets obtained as symmetric difference between models of p and
T. The semantics of Satoh's operator (i.e. the set of models of the knowledge base T
revised according to the formula p) is defined as follows:

Mod(T ∘_S p) = {M ∈ Mod(p) : ∃M' ∈ Mod(T) s.t. M Δ M' ∈ Δ^min(T, p)}.
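
For small domains the definition of Satoh's operator can be evaluated by brute force.
The following sketch is illustrative only: formulas are Python predicates over the set
of true atoms, an atomic knowledge base T is read as the conjunction of its atoms, and
the function names and the example formula are assumptions.

```python
from itertools import combinations

def subsets(atoms):
    """Enumerate all interpretations (sets of true atoms) over the given atoms."""
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def models(formula, atoms):
    return [m for m in subsets(atoms) if formula(m)]

def satoh_models(T, p, atoms):
    """Models of p whose symmetric difference with a model of T is inclusion-minimal."""
    mod_T = models(lambda m: T <= m, atoms)   # T is a conjunction of atoms
    mod_p = models(p, atoms)
    diffs = {m ^ mt for m in mod_p for mt in mod_T}
    minimal = {d for d in diffs if not any(other < d for other in diffs)}
    return {m for m in mod_p if any((m ^ mt) in minimal for mt in mod_T)}

atoms = {"a", "b", "c"}
T = frozenset(atoms)                                           # all atoms true
p = lambda m: "a" not in m or ("b" not in m and "c" not in m)  # NOT a OR (NOT b AND NOT c)

revised = satoh_models(T, p, atoms)
b_is_derivable = all("b" in m for m in revised)
print(sorted(sorted(m) for m in revised), b_is_derivable)
```

Here the inclusion-minimal differences are {a} and {b, c}, so the revised knowledge base
has the two models {b, c} and {a}, and the atom b is not derivable from all of them.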

In the following the set of atoms occurring in p will be denoted as V(p).
Π_2^p-completeness of the implication problem was shown to hold also if V(p) ⊆ T [10]:
we consider this case in our proof. Observe that the definition of ∘_S entails that for each
M ∈ Δ^min(T, p) it holds that M ⊆ T ∩ V(p), thus M is a subset of T.

We now consider an instance ⟨T, p, Q⟩ of the implication problem, where T is the
atomic knowledge base {a_1, ..., a_n}, p is a propositional formula (with V(p) ⊆ T),
and Q is an atom in T.

Let C_p be a boolean circuit equivalent to p. We consider the database scheme 𝒟
introduced in the hardness proof of Theorem 1. Moreover, we consider an instance D
which is the translation of C_p obtained in the same way as Theorem 1, except that:

- relation input must contain not only the tuples corresponding to the inputs of C_p
(i.e. the atoms in V(p)), but also the tuples corresponding to the atoms of T \ V(p);

- for each tuple inserted in relation input, attribute Val is set to 1, which means
assigning true to all the atoms of T.

Recall that measure attributes in the tuples of relations gate and gateInput are set to
-1 (corresponding to an undefined truth value).

Let AC be the same set of constraints used in the proof of Theorem 1. As explained
in the hardness proof of Theorem 1, AC defines the semantics of C_p and requires that
C_p is true. Note that every repair ρ for D w.r.t. AC must update all measure attributes
that initially are set to -1 in D. Therefore, given two repairs ρ and ρ', they differ only
on the set of atomic updates performed on relation input.

Obviously, every set-minimal repair ρ for D w.r.t. AC corresponds to a model
M in Mod(T ∘_S p), and vice versa. In fact, given a set-minimal repair ρ for D w.r.t.
AC, a model M for T ∘_S p can be obtained from the repaired database considering only
the tuples in relation input where attribute Val is equal to 1 after applying ρ. Observe
that the set of atoms M corresponding to ρ is a model of T ∘_S p, otherwise there would
exist M' ⊂ M with M' ∈ Mod(T ∘_S p), and the repair ρ' corresponding to M' would
satisfy ρ' <_set ρ, thus contradicting the minimality of ρ. Likewise, it is easy to see that
any model in Mod(T ∘_S p) corresponds to a minimal repair for D w.r.t. AC.

Finally consider the query q = input(id(Q), 1). The above considerations suffice
to prove that Q is derivable from every model in Mod(T ∘_S p) iff input(id(Q), 1) is
true in ρ(D) for every set-minimal repair ρ for D w.r.t. AC, that is the consistent answer
of input(id(Q), 1) on D w.r.t. AC is true. □

Theorem 4. Let 𝒟 be a database scheme, D be an instance of 𝒟, AC be a set of
aggregate constraints on 𝒟 and q be a query over 𝒟. Deciding whether q^card(D) =
true is Δ_2^p[log n]-complete.

Proof. (Membership) Membership in Δ_2^p[log n] derives from the fact that repairs on D
can be partitioned into the two sets T and F consisting of all repairs ρ_i s.t. q(ρ_i(D)) =
true and, respectively, q(ρ_i(D)) = false. Let MinSize(T) = min_{ρ ∈ T}(|λ(ρ)|), and
MinSize(F) = min_{ρ ∈ F}(|λ(ρ)|). It can be shown that q^card(D) = true iff
MinSize(T) < MinSize(F). Both MinSize(T) and MinSize(F) can be evaluated by a
logarithmic number of NP-oracle invocations.
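
The logarithmic bound comes from a binary search on the repair size. The sketch below
illustrates the idea under stated assumptions: the NP oracle is abstracted as a Python
callable answering "is there a repair in the given class updating at most k pairs?",
which is monotone in k; the names min_size and card_consistent_answer are hypothetical,
and treating an empty class as having infinite minimum size is a convention chosen for
the example.

```python
import math

def min_size(exists_repair_within, max_size):
    """Smallest k in [0, max_size] such that the oracle answers yes, or None.

    exists_repair_within(k) stands for one NP-oracle call; it must be monotone:
    once it answers yes for some k, it answers yes for every larger k.
    """
    if not exists_repair_within(max_size):
        return None
    lo, hi = 0, max_size
    while lo < hi:                      # O(log max_size) oracle calls
        mid = (lo + hi) // 2
        if exists_repair_within(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def card_consistent_answer(oracle_true, oracle_false, max_size):
    """Compare MinSize(T) and MinSize(F); an empty class counts as infinite (assumed)."""
    size_t = min_size(oracle_true, max_size)
    size_f = min_size(oracle_false, max_size)
    size_t = math.inf if size_t is None else size_t
    size_f = math.inf if size_f is None else size_f
    return size_t < size_f

# Toy oracles: the smallest repair making q true has size 2, the smallest
# repair making q false has size 5.
answer = card_consistent_answer(lambda k: k >= 2, lambda k: k >= 5, 16)
print(answer)
```

In an actual setting max_size would be bounded by the number of measure values in D, so
the number of oracle calls stays logarithmic in the size of the database.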

(Hardness) Hardness can be proved by showing a reduction from the following
implication problem in the context of propositional logic over a finite domain V: "given
an atomic knowledge base T on V, a formula Q on T and a formula p on V, decide
whether Q is derivable from every model in T ∘_D p", where T ∘_D p is the updated
(or revised) knowledge base according to Dalal's revision operator.
Δ_2^p[log n]-completeness of this problem was shown in [10].

The semantics of Dalal's revision operator is as follows. The models of T ∘_D p are
the models of p whose symmetric difference with models of T has minimum cardinality
w.r.t. all other models of p. More formally, let |Δ|^min(T, p) = min{|M Δ M'| : M ∈
Mod(p), M' ∈ Mod(T)}, that is the minimum number of atoms in which models of
T and p diverge. Then models of T ∘_D p are given by:

Mod(T ∘_D p) = {M ∈ Mod(p) : ∃M' ∈ Mod(T) s.t. |M Δ M'| = |Δ|^min(T, p)}.
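
Dalal's operator can likewise be evaluated by brute force on small domains. The sketch
is an illustration under the same kind of assumptions as any toy model enumeration:
formulas are Python predicates over the set of true atoms, T is read as the conjunction
of its atoms, and the function names and the example formula are hypothetical.

```python
from itertools import combinations

def subsets(atoms):
    """Enumerate all interpretations (sets of true atoms) over the given atoms."""
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def models(formula, atoms):
    return [m for m in subsets(atoms) if formula(m)]

def dalal_models(T, p, atoms):
    """Models of p at minimum Hamming distance from some model of T."""
    mod_T = models(lambda m: T <= m, atoms)   # T is a conjunction of atoms
    mod_p = models(p, atoms)
    if not mod_T or not mod_p:
        return set()
    best = min(len(m ^ mt) for m in mod_p for mt in mod_T)
    return {m for m in mod_p if any(len(m ^ mt) == best for mt in mod_T)}

atoms = {"a", "b", "c"}
T = frozenset(atoms)                                           # all atoms true
p = lambda m: "a" not in m or ("b" not in m and "c" not in m)  # NOT a OR (NOT b AND NOT c)

revised = dalal_models(T, p, atoms)
b_is_derivable = all("b" in m for m in revised)
print(sorted(sorted(m) for m in revised), b_is_derivable)
```

For this formula the minimum distance is 1, reached only by the model {b, c}, so b is
derivable. The model {a}, at distance 2, is discarded even though its difference from T
is minimal under set inclusion.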

Consider an instance ⟨V, T, p, Q⟩ of the implication problem, where V is the
finite domain of atoms, T an atomic knowledge base on V, p a formula on V, and Q a
formula on T. Let V(p) and V(Q) denote the set of atoms of V occurring in p and Q,
respectively. Sets T, V(p) and V(Q) can be partitioned into A, B, C, D, E, as shown
in Fig. 1(a).

Let C_p and C_Q be two boolean circuits equivalent to p and Q, respectively. C_p
and C_Q are reported in Fig. 1(b), with their inputs. In this figure, atoms belonging to
T, V(p) and V(Q) are represented as circles, and the two circuits are represented by
means of triangles. In particular, inputs of C_Q are the atoms b_1, ..., b_n of B and the
atoms c_1, ..., c_r of C, whereas inputs of C_p are the atoms c_1, ..., c_r of C, the atoms
d_1, ..., d_s of D, and the atoms e_1, ..., e_t of E. That is, the atoms of C are inputs of
both C_p and C_Q.

Fig. 1. (a) The partitioning of T, V(p), V(Q); (b) Circuits

These circuits can be represented as an instance of the database scheme 𝒟
introduced in the hardness proof of Theorem 1. In particular, we consider an instance D of
𝒟 which is the translation of C_p and C_Q obtained in the same way as Theorem 1, except
that:

- relation input contains a tuple for each atom in A ∪ B ∪ C ∪ D ∪ E;

- for each tuple inserted in relation input, attribute Val is set to 1 if it refers to an
atom in T, -1 otherwise. This means assigning true to all the atoms of T, and an
undefined truth value to atoms in E.

Recall that measure attributes in the tuples of relations gate and gateInput are set
to -1.

We consider the set of aggregate constraints AC consisting of constraints 1-5
introduced in the hardness proof of Theorem 1, plus the aggregate constraint
NORVal(id(o_p)) = 1, where id(o_p) is the identifier of the output gate of C_p. As
explained in the hardness proof of Theorem 1, AC defines the semantics of C_p and C_Q
and requires that C_p is true.

Note that every repair ρ for D w.r.t. AC must update all value attributes that initially
are assigned -1 in D. Therefore, given two repairs ρ and ρ' for D w.r.t. AC, they differ
only on the number of atomic updates performed on the tuples of input where Val was
set to 1 in D.

Obviously, every card-minimal repair ρ for D w.r.t. AC corresponds to a model
M in Mod(T ∘_D p), and vice versa (this can be proven straightforwardly, analogously
to the proof of Theorem 3, where the correspondence between set-minimal repairs for
D and models of T ∘_S p has been shown).

Finally consider the query q = input(id(o_Q), 1), where o_Q denotes the output
gate of C_Q. The above-mentioned considerations suffice to prove that Q is derivable
from every model in Mod(T ∘_D p) iff input(id(o_Q), 1) is true in ρ(D) for every
card-minimal repair ρ for D w.r.t. AC, that is the consistent answer of input(id(o_Q), 1) on
D w.r.t. AC is true. □



